contextual search

Density-Based Algorithms for Corruption-Robust Contextual Search and Convex Optimization

Leme, Renato Paes, Podimata, Chara, Schneider, Jon

arXiv.org Artificial Intelligence

We study the problem of contextual search, a generalization of binary search in higher dimensions, in the adversarial noise model. Let $d$ be the dimension of the problem, $T$ be the time horizon, and $C$ be the total amount of adversarial noise in the system. We focus on the $\epsilon$-ball and the absolute loss. For the $\epsilon$-ball loss, we give a tight regret bound of $O(C + d \log(1/\epsilon))$, improving over the $O(d^3 \log(1/\epsilon) \log^2(T) + C \log(T) \log(1/\epsilon))$ bound of Krishnamurthy et al. (Operations Research '23). For the absolute loss, we give an efficient algorithm with regret $O(C + d \log T)$. To tackle the absolute loss case, we study the more general setting of Corruption-Robust Convex Optimization with Subgradient feedback, which is of independent interest. Our techniques are a significant departure from prior approaches. Specifically, we keep track of density functions over the candidate target vectors instead of a knowledge set consisting of the candidate target vectors consistent with the feedback obtained.
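
To make the last point concrete, here is a minimal one-dimensional sketch of the density idea. It is not the authors' algorithm: it is a toy weighted binary search on a grid over $[0, 1]$, and the names `density_search`, `corrupted_rounds`, and `penalty` are hypothetical choices made for illustration. Instead of discarding candidates that disagree with a response, it halves their weight, so a corrupted response lowers the weight of the true target without eliminating it.

```python
from bisect import bisect_left
from itertools import accumulate


def density_search(target, rounds, corrupted_rounds=frozenset(),
                   grid_size=1001, penalty=0.5):
    """Toy 1-D search for `target` in [0, 1] from above/below feedback.

    Illustrative assumption, not the paper's method: feedback in the
    rounds listed in `corrupted_rounds` is flipped by an adversary.
    """
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    weights = [1.0] * grid_size  # unnormalized density over candidates

    for t in range(rounds):
        # Query the weighted median of the current density.
        cdf = list(accumulate(weights))
        idx = bisect_left(cdf, cdf[-1] / 2)
        query = grid[idx]

        # Feedback: is the target at or above the query? Flip if corrupted.
        above = target >= query
        if t in corrupted_rounds:
            above = not above

        # Down-weight (rather than discard) the inconsistent candidates.
        inconsistent = range(0, idx) if above else range(idx, grid_size)
        for i in inconsistent:
            weights[i] *= penalty

    # Report the candidate with the largest remaining weight.
    best = max(range(grid_size), key=weights.__getitem__)
    return grid[best]


if __name__ == "__main__":
    print(density_search(0.3141, rounds=60))
    print(density_search(0.3141, rounds=60, corrupted_rounds={0}))
```

Setting `penalty` to 0 would correspond to the knowledge-set approach, where a single flipped response can permanently remove the true target from consideration.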


Automating Bibliometric Analysis with Sentence Transformers and Retrieval-Augmented Generation (RAG): A Pilot Study in Semantic and Contextual Search for Customized Literature Characterization for High-Impact Urban Research

Xu, Haowen, Li, Xueping, Tupayachi, Jose, Lian, Jianming, Omitaomu, Femi

arXiv.org Artificial Intelligence

Bibliometric analysis is essential for understanding research trends, scope, and impact in urban science, especially in high-impact journals such as Nature Portfolio journals. However, traditional methods, which rely on keyword searches and basic NLP techniques, often fail to uncover valuable insights not explicitly stated in article titles or keywords. These approaches cannot perform semantic search or capture context, which limits their effectiveness in classifying topics and characterizing studies. In this paper, we address these limitations by leveraging Generative AI models, specifically transformers and Retrieval-Augmented Generation (RAG), to automate and enhance bibliometric analysis. We developed a technical workflow that integrates a vector database, Sentence Transformers, a Gaussian Mixture Model (GMM), a retrieval agent, and Large Language Models (LLMs) to enable contextual search, topic ranking, and characterization of research using customized prompt templates. A pilot study analyzing 223 urban science-related articles published in Nature Communications over the past decade highlights the effectiveness of our approach in generating insightful summary statistics on the quality, scope, and characteristics of papers in high-impact journals. This study introduces a new paradigm for enhancing bibliometric analysis and knowledge retrieval in urban research, positioning an AI agent as a powerful tool for advancing research evaluation and understanding.
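
The retrieval step of such a workflow can be sketched as follows. This is an assumption-laden illustration, not the paper's pipeline: the `embed` function is a bag-of-words stand-in for a Sentence Transformers model so that the example runs without downloading anything, the in-memory list stands in for a vector database, and the function names, the sample documents, and the prompt template are all made up for the example.

```python
import math
from collections import Counter


def embed(text):
    """Stand-in embedding: word counts (NOT a Sentence Transformer)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=2):
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(question, passages):
    """Fill a simple, hypothetical prompt template with retrieved passages."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Context:\n{context}\n\nQuestion: {question}"


# Made-up sample corpus for illustration only.
docs = [
    "urban heat islands raise energy demand in cities",
    "protein folding prediction with deep learning",
    "public transit access shapes urban mobility patterns",
]

if __name__ == "__main__":
    question = "urban mobility and transit"
    passages = retrieve(question, docs, k=1)
    print(build_prompt(question, passages))
```

In a real system the prompt returned by `build_prompt` would be sent to an LLM, and the word-count vectors would be replaced by dense sentence embeddings, which can match passages that share meaning but not vocabulary.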